A Study of Mispredicted Branches Dependent on Load Misses in Continual Flow Pipelines
Authors
Abstract
Large instruction window processors can achieve high performance by supplying more instructions during long-latency load misses, thus effectively hiding these latencies. Continual Flow Pipeline (CFP) architectures provide high performance by effectively increasing the number of actively executing instructions without increasing the size of the cycle-critical structures. A CFP includes a Slice Processing Unit, which stores missed loads and their forward slices in a Slice Data Buffer. This frees the resources occupied by these idle instructions for new instructions. In this project, we designed and implemented a CFP on top of SimpleScalar and compared it with conventional pipelines on benchmarks from the SPEC integer benchmark suite. We also studied the behavior of mispredicted branches dependent on load misses, which turn out to be the main bottleneck in CFPs, and compared the performance of CFPs with ideal and non-ideal fetch mechanisms.
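The forward-slice idea in the abstract can be illustrated with a minimal sketch. It assumes a toy in-order instruction stream and a simple "poison" set of registers. The `Instr` record, the `misses` flag, and `split_forward_slice` are hypothetical names introduced here for illustration; they are not taken from the paper or from SimpleScalar.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str
    dest: Optional[str]            # destination register, None for branches
    srcs: Tuple[str, ...] = ()     # source registers
    misses: bool = False           # hypothetical flag: this load misses in the cache


def split_forward_slice(stream: List[Instr]) -> Tuple[List[Instr], List[Instr]]:
    """Split a toy instruction stream into miss-independent instructions
    and the forward slice of the missed loads."""
    poisoned = set()     # registers whose values depend on an outstanding miss
    executed = []        # instructions that can complete and release resources
    slice_buffer = []    # stands in for the Slice Data Buffer
    for ins in stream:
        depends_on_miss = ins.misses or any(s in poisoned for s in ins.srcs)
        if depends_on_miss:
            # Defer the instruction and poison its result.
            slice_buffer.append(ins)
            if ins.dest is not None:
                poisoned.add(ins.dest)
        else:
            executed.append(ins)
            if ins.dest is not None:
                # The register now holds a miss-independent value.
                poisoned.discard(ins.dest)
    return executed, slice_buffer


program = [
    Instr("load", "r1", ("r9",), misses=True),
    Instr("add", "r2", ("r1", "r3")),
    Instr("mul", "r4", ("r5", "r6")),
    Instr("branch", None, ("r2",)),
    Instr("add", "r1", ("r5", "r6")),
    Instr("sub", "r7", ("r1", "r4")),
]
executed, slice_buffer = split_forward_slice(program)
print([i.op for i in slice_buffer])
print([i.op for i in executed])
```

In this toy stream the branch reads a poisoned register, so it lands in the slice buffer and cannot be resolved until the load returns. That is the kind of miss-dependent branch the abstract identifies as the main bottleneck when it is mispredicted.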
Similar resources
The Effect of Executing Mispredicted Load Instructions in a Speculative Multithreaded Architecture
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism in application programs. A single-threaded sequencing mechanism needs speculative execution beyond conditional branches in order to exploit more instruction-level parallelism. In addition, an aggressive multithreaded architecture should also use thread-level control speculation in order to exploit ...
Exploiting the Prefetching Effect Provided by Executing Mispredicted Load Instructions
As the degree of instruction-level parallelism in superscalar architectures increases, the gap between processor and memory performance continues to grow, requiring more aggressive techniques to increase the performance of the memory system. We propose a new technique, which is based on the wrong-path execution of loads far beyond instruction fetch-limiting conditional branches, to exploit more ...
Using Incorrect Speculation to Prefetch Data in a Concurrent Multithreaded Processor
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. The resulting speculative issuing of load instructions in these architectures can significantly impact the performance of the memory hierarchy as the system exploits higher degrees of parallelism. In this study, we in...
A Day in the Life of a Data Cache Miss
The activity within a processor following a cache miss is studied via a series of simulation experiments. This is a preliminary step toward developing ways of mitigating data cache miss penalties, especially for long misses. With a modest-sized reorder buffer (ROB) of 64 entries, structural blockages due to a full ROB are the major cause of the cache miss penalty. For the SpecINT2000 benchmarks...
A New Load-Flow Method in Distribution Networks based on an Approximation Voltage-Dependent Load model in Extensive Presence of Distributed Generation Sources
Power-flow (PF) solution is a basic and powerful tool in power system analysis. Distribution networks (DNs), compared to transmission systems, have many fundamental distinctions that cause the conventional PF to be ineffective on these networks. This paper presents a new fast and efficient PF method which provides all different models of Distributed Generations (DGs) and their operational modes...